I discovered a strange "bug" in the HDSoC. If you never set the channels before taking data like so:
readout_controller.attr("set_readout_channels")(py_channels); // Pass the Python list to the function
The board will still send data, but in a very strange way. Every packet has the correct structure, but later in the event the data payload slowly "fades" to zeros: the first packets in the event look right, while later packets become increasingly zero-dominated. The header and trailer information stays correct. Furthermore, at larger window counts the timing degrades to the point that correct events can no longer be formed.
Regardless, I fixed the bug where I wasn't setting this every time and everything appears to be working as expected.
I started a rate test exploring the (external trigger rate, number of channels, number of windows) parameter space. (size(unique rates), size(unique number of channels), size(unique number of windows)) = (101, 6, 7) --> 101\cdot6\cdot7 = 4242 samples (aka runs). The boundaries of the parameter spaces are
rates: [100, 30000] Hz
channels: [1,32]
windows: [1,62]
These spaces are sampled as follows:
rates: [100, 150, 200, ..., 1000, 1250, 1500, ..., 10000, 10500, 11000, ..., 30000]
(i.e. we move in 20 steps of 50, then 40 steps of 250, then 40 steps of 500)
channels: [1,2,4,8,16,32]
windows: [1,2,4,8,16,32,62]
I use python scripts in /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/ (and some ODB tricks) to sample in this fashion.
For each run/sample, I wait 5 seconds before recording the rate. In principle it takes at least 3 seconds (that's the ODB refresh rate; maybe we can change this somewhere?), but I add 2 seconds just to be safe. The first measurement is very slightly unstable (off by ~1-5% of the requested rate) because of some startup overhead. This is the price we pay for speed; waiting longer would increase the data-taking time.
We can conservatively estimate the data taking time as:
T = \text{number of runs} \cdot 20 \text{ seconds}
The 20 seconds per run allows for the time taken by the start and stop transitions. More realistically this is probably closer to 10 seconds per run. In any event, our estimate for the total time of this run is
T = 4242 \cdot 20 \text{ seconds} = 84840 \text{ seconds} = 0.98 \text{ days}
So it will take about a day to run (assuming no errors).
If the sequencer completes, we will show the system is very stable! I'm hopeful.
/home/pioneer/packages/experiments/atar_daq/rate_data
{
"/Equipment/HDSoC-00/Settings/pwm": {
"pwm_frequency_hz": 100,
"pwm_pulse_width_ns": 10000
},
"/Equipment/HDSoC-00/Statistics": {
"Events sent": 294,
"Events per sec.": 97.97,
"kBytes per sec.": 21.945
},
"/Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture": {
"lookback": 1,
"lookback_mode": "",
"trigger_mode": "ext",
"windows": 1,
"write_after_trig": 1,
"active_channels": [
0
]
},
"/Runinfo": {
"Run number": 60,
"Start time": "Fri Mar 7 08:28:22 2025",
"Start time binary": 1741354102,
"Stop time": "Fri Mar 7 08:27:17 2025",
"Stop time binary": 0
}
}
/home/pioneer/packages/data/atar/userfiles/sequencer/nalu_rate_test.msl
ODBCREATE /Sequencer/Helpers/rate_index, UINT8
ODBCREATE /Sequencer/Helpers/num_channels_index, UINT8
ODBCREATE /Sequencer/Helpers/num_windows_index, UINT8
ODBCREATE /Sequencer/Helpers/num_channels, UINT8
LOOP rate_index, 101
    ODBSET /Sequencer/Helpers/rate_index, $rate_index
    SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/get_rate.py
    rate = $SCRIPT_RESULT
    ODBSET /Equipment/HDSoC-00/Settings/pwm/pwm_frequency_hz, $rate
    num_channels = 1
    LOOP num_channels_index, 6
        ODBSET /Sequencer/Helpers/num_channels_index, $num_channels_index
        SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/get_num_channels.py
        num_channels = $SCRIPT_RESULT
        ODBSET /Sequencer/Helpers/num_channels, $num_channels
        SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/set_enabled_channels.py
        num_windows = 1
        LOOP num_windows_index, 7
            ODBSET /Sequencer/Helpers/num_windows_index, $num_windows_index
            SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/get_num_windows.py
            num_windows = $SCRIPT_RESULT
            ODBSET /Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture/lookback, $num_windows
            ODBSET /Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture/write_after_trig, $num_windows
            ODBSET /Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture/windows, $num_windows
            TRANSITION START
            WAIT Seconds, 5
            SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/save_odb_rate_info.py
            TRANSITION STOP
        ENDLOOP
    ENDLOOP
ENDLOOP
ODBDELETE /Sequencer/Helpers/rate_index
ODBDELETE /Sequencer/Helpers/num_channels_index
ODBDELETE /Sequencer/Helpers/num_windows_index
ODBDELETE /Sequencer/Helpers/num_channels
My previous entry had some bad math; here is the corrected entry:
I started a rate test exploring the (external trigger rate, number of channels, number of windows) parameter space. (size(unique rates), size(unique number of channels), size(unique number of windows)) = (95, 6, 7) --> 95\cdot6\cdot7 = 3990 samples (aka runs). The boundaries of the parameter spaces are
rates: [100, 30000] Hz
channels: [1,32]
windows: [1,62]
These spaces are sampled as follows:
rates: [100, 150, 200, ..., 1000, 1250, 1500, ..., 10000, 10500, 11000, ..., 30000]
(i.e. we move in 18 steps of 50, then 36 steps of 250, then 40 steps of 500)
channels: [1,2,4,8,16,32]
windows: [1,2,4,8,16,32,62]
I use python scripts in /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/ (and some ODB tricks) to sample in this fashion.
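The sampling grid above can be reproduced with a few lines of Python. This is only a sketch of the grid itself, not the actual helper scripts; the function names `build_rates` and `rate_for_index`, and the assumption that the sequencer hands over a 1-based index, are illustrative.

```python
# Sketch of the sampling grid described above (names are hypothetical).

def build_rates():
    """Trigger rates in Hz: 50 Hz steps up to 1 kHz, 250 Hz steps up to 10 kHz,
    then 500 Hz steps up to 30 kHz."""
    rates = list(range(100, 1000, 50))        # 100 ... 950
    rates += list(range(1000, 10000, 250))    # 1000 ... 9750
    rates += list(range(10000, 30001, 500))   # 10000 ... 30000
    return rates

CHANNELS = [1, 2, 4, 8, 16, 32]
WINDOWS = [1, 2, 4, 8, 16, 32, 62]

def rate_for_index(index):
    """Look up a rate from a 1-based index (assumed sequencer convention)."""
    return build_rates()[index - 1]

rates = build_rates()
n_runs = len(rates) * len(CHANNELS) * len(WINDOWS)
print(len(rates), len(CHANNELS), len(WINDOWS))  # 95 6 7
print(n_runs)                                   # 3990
```

Counting the list this way is a quick check on the grid size: 1 starting value plus 18 + 36 + 40 steps gives 95 rates.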
For each run/sample, I wait 5 seconds before recording the rate. In principle it takes at least 3 seconds (that's the ODB refresh rate; maybe we can change this somewhere?), but I add 2 seconds just to be safe. The first measurement is very slightly unstable (off by ~1-5% of the requested rate) because of some startup overhead. This is the price we pay for speed; waiting longer would increase the data-taking time.
We can conservatively estimate the data taking time as:
T = \text{number of runs} \cdot 20 \text{ seconds}
The 20 seconds per run allows for the time taken by the start and stop transitions. More realistically this is probably closer to 10 seconds per run. In any event, our estimate for the total time of this run is
T = 3990 \cdot 20 \text{ seconds} = 79800 \text{ seconds} = 0.92 \text{ days}
So it will take about a day to run (assuming no errors).
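For reference, the same estimate worked through for both the conservative 20 seconds per run and the more realistic 10 seconds per run (a small arithmetic sketch; the helper name `total_days` is just for illustration):

```python
# Total scan time for the 95 x 6 x 7 grid at two per-run durations.
N_RUNS = 95 * 6 * 7
SECONDS_PER_DAY = 86400

def total_days(n_runs, seconds_per_run):
    """Convert a per-run duration into a total scan time in days."""
    return n_runs * seconds_per_run / SECONDS_PER_DAY

for per_run in (20, 10):
    total_s = N_RUNS * per_run
    print(f"{per_run} s/run: {total_s} s = {total_days(N_RUNS, per_run):.2f} days")
# 20 s/run: 79800 s = 0.92 days
# 10 s/run: 39900 s = 0.46 days
```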
If the sequencer completes, we will show the system is very stable! I'm hopeful.
/home/pioneer/packages/experiments/atar_daq/rate_data
{
"/Equipment/HDSoC-00/Settings/pwm": {
"pwm_frequency_hz": 100,
"pwm_pulse_width_ns": 10000
},
"/Equipment/HDSoC-00/Statistics": {
"Events sent": 294,
"Events per sec.": 97.97,
"kBytes per sec.": 21.945
},
"/Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture": {
"lookback": 1,
"lookback_mode": "",
"trigger_mode": "ext",
"windows": 1,
"write_after_trig": 1,
"active_channels": [
0
]
},
"/Runinfo": {
"Run number": 60,
"Start time": "Fri Mar 7 08:28:22 2025",
"Start time binary": 1741354102,
"Stop time": "Fri Mar 7 08:27:17 2025",
"Stop time binary": 0
}
}
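A record like the one above is enough to check how far the measured rate is from the requested one. The sketch below uses only the two keys shown in the sample; how the records are laid out on disk under rate_data is not assumed here, so the sample is embedded inline, and `rate_deviation` is a hypothetical helper name.

```python
import json

def rate_deviation(record):
    """Fractional deviation of the measured event rate from the requested PWM rate."""
    requested = record["/Equipment/HDSoC-00/Settings/pwm"]["pwm_frequency_hz"]
    measured = record["/Equipment/HDSoC-00/Statistics"]["Events per sec."]
    return (measured - requested) / requested

# Trimmed copy of the sample record above (only the keys used here).
sample = json.loads("""
{
  "/Equipment/HDSoC-00/Settings/pwm": {"pwm_frequency_hz": 100, "pwm_pulse_width_ns": 10000},
  "/Equipment/HDSoC-00/Statistics": {"Events sent": 294, "Events per sec.": 97.97, "kBytes per sec.": 21.945}
}
""")

print(f"{rate_deviation(sample):+.2%}")  # -2.03%
```

For this sample the measured rate is about 2% below the requested 100 Hz, inside the ~1-5% startup instability mentioned earlier.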
/home/pioneer/packages/data/atar/userfiles/sequencer/nalu_rate_test.msl
PARAM start_rate_index, "Index to start rate at (for continuing failed tests)", 1
PARAM start_num_channels_index, "Index to start num_channels at (for continuing failed tests)", 1
PARAM start_num_windows_index, "Index to start num_windows at (for continuing failed tests)", 1
end_rate_index = 95
end_num_channels_index = 6
end_num_windows_index = 7
rate_iterations = $end_rate_index - $start_rate_index + 1
num_channels_iterations = $end_num_channels_index - $start_num_channels_index + 1
num_windows_iterations = $end_num_windows_index - $start_num_windows_index + 1
ODBCREATE /Sequencer/Helpers/rate_index, UINT8
ODBCREATE /Sequencer/Helpers/num_channels_index, UINT8
ODBCREATE /Sequencer/Helpers/num_windows_index, UINT8
ODBCREATE /Sequencer/Helpers/num_channels, UINT8
event_sent = 0
LOOP i, $rate_iterations
    rate_index = $i + $start_rate_index - 1
    ODBSET /Sequencer/Helpers/rate_index, $rate_index
    SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/get_rate.py
    rate = $SCRIPT_RESULT
    ODBSET /Equipment/HDSoC-00/Settings/pwm/pwm_frequency_hz, $rate
    num_channels = 1
    LOOP j, $num_channels_iterations
        num_channels_index = $j + $start_num_channels_index - 1
        ODBSET /Sequencer/Helpers/num_channels_index, $num_channels_index
        SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/get_num_channels.py
        num_channels = $SCRIPT_RESULT
        ODBSET /Sequencer/Helpers/num_channels, $num_channels
        SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/set_enabled_channels.py
        num_windows = 1
        LOOP k, $num_windows_iterations
            num_windows_index = $k + $start_num_windows_index - 1
            ODBSET /Sequencer/Helpers/num_windows_index, $num_windows_index
            SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/get_num_windows.py
            num_windows = $SCRIPT_RESULT
            ODBSET /Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture/lookback, $num_windows
            ODBSET /Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture/write_after_trig, $num_windows
            ODBSET /Equipment/HDSoC-00/Settings/nalu_board_controller/nalu_capture/windows, $num_windows
            TRANSITION START
            WAIT Seconds, 5
            ODBGET /Equipment/HDSoC-00/Statistics/Events sent, event_sent
            IF ($event_sent <= 0)
                MSG "Sequencer read 0 events sent, exiting prematurely"
                ODBDELETE /Sequencer/Helpers/rate_index
                ODBDELETE /Sequencer/Helpers/num_channels_index
                ODBDELETE /Sequencer/Helpers/num_windows_index
                ODBDELETE /Sequencer/Helpers/num_channels
                EXIT
            ENDIF
            SCRIPT /home/pioneer/packages/experiments/atar_daq/scripts/python_sequencer_helpers/save_odb_rate_info.py
            TRANSITION STOP
        ENDLOOP
        num_windows_iterations = $end_num_windows_index
        start_num_windows_index = 1
    ENDLOOP
    num_channels_iterations = $end_num_channels_index
    start_num_channels_index = 1
ENDLOOP
rate_iterations = $end_rate_index
start_rate_index = 1
ODBDELETE /Sequencer/Helpers/rate_index
ODBDELETE /Sequencer/Helpers/num_channels_index
ODBDELETE /Sequencer/Helpers/num_windows_index
ODBDELETE /Sequencer/Helpers/num_channels
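The resume behaviour the start-index parameters are meant to give (pick up at a given point, then fall back to full inner loops) can be sketched in Python as a sanity check on the iteration counts. This is a model of the intended loop logic only, assuming 1-based indices; `remaining_points` is a hypothetical name and nothing here talks to the sequencer.

```python
def remaining_points(start, end=(95, 6, 7)):
    """Yield 1-based (rate_index, num_channels_index, num_windows_index) tuples
    in scan order, beginning at `start` and running to the end of the grid."""
    rate_start, chan_start, win_start = start
    rate_end, chan_end, win_end = end
    for r in range(rate_start, rate_end + 1):
        for c in range(chan_start, chan_end + 1):
            for w in range(win_start, win_end + 1):
                yield (r, c, w)
            win_start = 1   # only the first inner pass starts part-way through
        chan_start = 1      # likewise for the channel loop

full = list(remaining_points((1, 1, 1)))
resumed = list(remaining_points((2, 3, 4)))
print(len(full))                # 3990
print(len(resumed))             # 3931
print(resumed[0], resumed[-1])  # (2, 3, 4) (95, 6, 7)
```

Note that each loop needs `end - start + 1` iterations to reach the last index, which is the count an inclusive 1-based range covers.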
For some reason, my sequencer runs are getting cut short by these errors:
00:55:42.172 2025/03/08 [Sequencer,INFO] Client 'HDSoC00' on database 'ODB' pid 304009 does not exist and db_cleanup2 called by cm_cleanup removed it
00:55:42.172 2025/03/08 [Sequencer,INFO] Client 'HDSoC00' on 'SYSMSG' removed by cm_cleanup (idle 2.2s, timeout 2s)
00:55:42.129 2025/03/08 [Sequencer,ERROR] [midas.cxx:7423:cm_shutdown,ERROR] Killing and Deleting client 'HDSoC00' pid 304009
00:55:42.129 2025/03/08 [Sequencer,ERROR] [midas.cxx:7420:cm_shutdown,ERROR] Cannot connect to client 'HDSoC00' on host 'localhost', port 36979
00:55:42.129 2025/03/08 [Sequencer,ERROR] [midas.cxx:12195:rpc_client_connect,ERROR] timeout waiting for server reply
00:55:42.129 2025/03/08 [Sequencer,ERROR] [midas.cxx:13590:rpc_client_call,ERROR] call to "HDSoC00" on "localhost" RPC "rc_transition": error, ss_recv_net_command() status 411
00:55:42.129 2025/03/08 [Sequencer,ERROR] [system.cxx:5626:ss_recv_net_command,ERROR] error receiving network command header, see messages
00:55:42.129 2025/03/08 [Sequencer,ERROR] [system.cxx:5578:recv_tcp2,ERROR] unexpected connection error, recv() errno 104 (Connection reset by peer)
This looks like it has nothing to do with my DAQ code... I'm unsure how to fix it. Maybe there's a memory leak, but that's hard to believe, because the whole system would bottleneck if that were the case, and the sequencer never appears to slow down.
I was wrong: there seems to be something specifically wrong with the parameters:
(rate (Hz), num_channels, num_windows) = (3250, 16, 62)
[INFO] Data capture started successfully.
Started run 7670
[WARN] Start marker not found at expected position.
./run.sh: line 74: 450315 Killed "$EXECUTABLE" "${EXEC_ARGS[@]}"
This appears to happen only if I start it with the sequencer. I don't even see the start marker warning unless I start with the sequencer. My only theory is that the sequencer is somehow interfering with the ability to receive UDP data.
The debugger gives no help...
pioneer@pioneer-MS-7D41:~/packages/experiments/atar_daq/scripts$ ./run.sh --debug
Running with debugger (gdb)...
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /home/pioneer/packages/experiments/atar_daq/scripts/../bin/frontend...
(gdb) run
Starting program: /home/pioneer/packages/experiments/atar_daq/bin/frontend -i 0
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Frontend name : HDSoC00
Event buffer size : 671088640
User max event size : 134217728
User max frag. size : 671088640
# of events per buffer : 5
Connect to experiment ATAR_DAQ...
OK
[New Thread 0x7fffef2ca640 (LWP 459507)]
[HDSoC00,INFO] Client 'HDSoC00' on buffer 'SYSTEM' removed by bm_open_buffer because process pid 458746 does not exist
Init hardware...
[DEBUG] Initializing Python interpreter...
[DEBUG] Python interpreter initialized.
[DEBUG] Importing 'naludaq.board' module...
[New Thread 0x7fffe39ff640 (LWP 459508)]
[New Thread 0x7fffe31fe640 (LWP 459509)]
[New Thread 0x7fffde9fd640 (LWP 459510)]
[New Thread 0x7fffde1fc640 (LWP 459511)]
[New Thread 0x7fffdb9fb640 (LWP 459512)]
[New Thread 0x7fffd91fa640 (LWP 459513)]
[New Thread 0x7fffd49f9640 (LWP 459514)]
[New Thread 0x7fffd41f8640 (LWP 459515)]
[New Thread 0x7fffd19f7640 (LWP 459516)]
[New Thread 0x7fffcd1f6640 (LWP 459517)]
[New Thread 0x7fffcc9f5640 (LWP 459518)]
[New Thread 0x7fffc81f4640 (LWP 459519)]
[New Thread 0x7fffc79f3640 (LWP 459520)]
[New Thread 0x7fffc51f2640 (LWP 459521)]
[New Thread 0x7fffc09f1640 (LWP 459522)]
[New Thread 0x7fffbe1f0640 (LWP 459523)]
[New Thread 0x7fffbd9ef640 (LWP 459524)]
[New Thread 0x7fffb91ee640 (LWP 459525)]
[New Thread 0x7fffb89ed640 (LWP 459526)]
[New Thread 0x7fffb41ec640 (LWP 459527)]
[New Thread 0x7fffb19eb640 (LWP 459528)]
[New Thread 0x7fffb11ea640 (LWP 459529)]
[New Thread 0x7fffac9e9640 (LWP 459530)]
[New Thread 0x7fffa9a8f640 (LWP 459531)]
[New Thread 0x7fffa928e640 (LWP 459532)]
D3XX import failed: Support for USB 3 on Linux is unavailable
[New Thread 0x7fffa89c2640 (LWP 459533)]
[New Thread 0x7fffa2851640 (LWP 459534)]
[Detaching after vfork from child process 459535]
[DEBUG] 'naludaq.board' module imported.
[DEBUG] Creating board object...
[DEBUG] Board object created.
[DEBUG] Getting UDP connection...
[DEBUG] UDP connection established.
[DEBUG] Resetting board...
[DEBUG] Board reset.
[DEBUG] Initializing control registers...
[DEBUG] Control registers initialized.
[DEBUG] Importing 'naludaq.tools.data_collector'...
[DEBUG] Getting trigger controller...
[DEBUG] Starting up board...
[DEBUG] Disconnecting board...
[INFO] Board initialization complete.
Connected to /dev/ttyACM0
OK
[DEBUG] NaluUdpReceiver initialized with address: 192.168.1.1 and port: 12345
[DEBUG] NaluUdpReceiver initialized with parameters from NaluUdpReceiverParams.
[DEBUG] NaluEventCollector created.
[DEBUG] Starting NaluUdpReceiver...
[New Thread 0x7fffa09bc640 (LWP 459747)]
[DEBUG] Receiver started.
[DEBUG] Capture parameters initialized.
[DEBUG] Readout window set: windows=62, lookback=62, write_after_trig=62
[DEBUG] Importing naludaq.controllers...
[DEBUG] Retrieving connection controller...
[DEBUG] Initializing socket...
[DEBUG] Establishing UDP connection between board and host...
[DEBUG] Socket initialization completed.
[DEBUG] Retrieving readout controller...
[DEBUG] Checking if readout channels are provided...
[DEBUG] Setting readout channels...
[DEBUG] Activating channels: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
[DEBUG] Configuring readout window...
[DEBUG] Setting receiver address to 192.168.1.1:12345
[DEBUG] Configuring Ethernet settings...
[DEBUG] Retrieving board controller...
[DEBUG] Starting readout with trigger_mode=ext and lookback_mode=
[INFO] Data capture started successfully.
Started run 7679
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[WARN] Start marker not found at expected position.
[Thread 0x7fffa09bc640 (LWP 459747) exited]
[Thread 0x7fffa2851640 (LWP 459534) exited]
[Thread 0x7fffa89c2640 (LWP 459533) exited]
[Thread 0x7fffa928e640 (LWP 459532) exited]
[Thread 0x7fffa9a8f640 (LWP 459531) exited]
[Thread 0x7fffac9e9640 (LWP 459530) exited]
[Thread 0x7fffb11ea640 (LWP 459529) exited]
[Thread 0x7fffb19eb640 (LWP 459528) exited]
[Thread 0x7fffb41ec640 (LWP 459527) exited]
[Thread 0x7fffb89ed640 (LWP 459526) exited]
[Thread 0x7fffb91ee640 (LWP 459525) exited]
[Thread 0x7fffbd9ef640 (LWP 459524) exited]
[Thread 0x7fffbe1f0640 (LWP 459523) exited]
[Thread 0x7fffc09f1640 (LWP 459522) exited]
[Thread 0x7fffc51f2640 (LWP 459521) exited]
[Thread 0x7fffc79f3640 (LWP 459520) exited]
[Thread 0x7fffc81f4640 (LWP 459519) exited]
[Thread 0x7fffcc9f5640 (LWP 459518) exited]
[Thread 0x7fffcd1f6640 (LWP 459517) exited]
[Thread 0x7fffd19f7640 (LWP 459516) exited]
[Thread 0x7fffd41f8640 (LWP 459515) exited]
[Thread 0x7fffd49f9640 (LWP 459514) exited]
[Thread 0x7fffd91fa640 (LWP 459513) exited]
[Thread 0x7fffdb9fb640 (LWP 459512) exited]
[Thread 0x7fffde1fc640 (LWP 459511) exited]
[Thread 0x7fffde9fd640 (LWP 459510) exited]
[Thread 0x7fffe31fe640 (LWP 459509) exited]
[Thread 0x7fffe39ff640 (LWP 459508) exited]
[Thread 0x7fffef2ca640 (LWP 459507) exited]
Program terminated with signal SIGKILL, Killed.
The program no longer exists.
(gdb) backtrace
No stack.
(gdb)
It seems the problem is that the frontend, for some reason, consumes all of the system RAM, at which point it is killed (the SIGKILL above would be consistent with the kernel's out-of-memory killer stepping in).
Again, this only happens with the sequencer. I.e. running the frontend with parameters
(rate (Hz), num_channels, num_windows) = (3250, 16, 62)
by hand will not crash it. Furthermore, running with parameters
(rate (Hz), num_channels, num_windows) = (3250, 16, 32)
with the sequencer will not crash it, and I see no such rise in memory. Furthermore,
(rate (Hz), num_channels, num_windows) = (3250, 32, 62) to (3250, 32, 62)
had no such memory or crashing issues.
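To compare memory growth between parameter sets more systematically, the frontend's resident memory could be logged while a run is in progress. A minimal sketch, assuming a Linux host where `/proc/<pid>/status` exposes a `VmRSS` line; the function names and the sample text are illustrative, and the PID would be that of the running frontend.

```python
import re
import time

def parse_vmrss_kib(status_text):
    """Extract the resident set size (kB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None

def read_rss_kib(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())

def watch(pid, interval_s=1.0, samples=10):
    """Print the RSS of `pid` every `interval_s` seconds, `samples` times."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_rss_kib(pid), "kB")
        time.sleep(interval_s)

# Parsing check on a made-up status excerpt (not real frontend output):
example_status = "Name:\tfrontend\nVmPeak:\t  900000 kB\nVmRSS:\t  123456 kB\n"
print(parse_vmrss_kib(example_status))  # 123456
```

Calling `watch(pid)` with the frontend's PID during a (3250, 16, 62) run and again during a (3250, 16, 32) run would show whether the growth is steady or sudden.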
I have absolutely no idea why this is happening. Things it shouldn't be:
My only idea to get around this is to say "screw the sequencer" and just write a midas frontend that does what the sequencer does, i.e. 1 continuous run, but we pause to change parameters every so often.